Just-in-time subgrammar extraction for HPSG

Authors

  • Vlado Kešelj
  • Nick Cercone
Abstract

We define the basic problem of subgrammar extraction for head-driven phrase structure grammars (HPSG) in the following way: Given a large HPSG grammar G and a set of words W, find a small subgrammar of G that accepts the same set of sentences from W as G, and for each of them produces the same parse trees. The set of words W is obtained from a piece of text. Additionally, we assume that this operation is done "just-in-time," i.e., just before parsing the text. This application requires that the operation be done automatically and efficiently. After defining the problem in the general framework, we discuss the problem for context-free grammars (CFG), and give an efficient algorithm for it. We show that finding the smallest subgrammar for HPSGs is an NP-hard problem, and give an efficient algorithm that solves an easier, approximate version of the problem. We also discuss how the algorithm can be efficiently implemented.
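To make the CFG case concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not spell out. It assumes that, for a CFG, the problem can be handled by the standard removal of useless rules once the terminals are restricted to W. The function name `extract_subgrammar`, the rule encoding as `(lhs, rhs_tuple)` pairs, and the toy grammar are all hypothetical and chosen only for illustration. The fixed-point loop is deliberately simple rather than optimized.

```python
from collections import defaultdict


def extract_subgrammar(rules, start, words):
    """Return the rules of a CFG that can take part in a parse over `words`.

    rules: list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    A symbol is treated as a nonterminal if it occurs as some lhs,
    and as a terminal (a word) otherwise. This encoding is an assumption.
    """
    nonterminals = {lhs for lhs, _ in rules}
    words = set(words)

    # Pass 1: fixed point over "productive" nonterminals, i.e. those that
    # derive at least one string made only of words from `words`.
    productive = set()

    def usable(rhs):
        return all(
            (s in productive) if s in nonterminals else (s in words)
            for s in rhs
        )

    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in productive and usable(rhs):
                productive.add(lhs)
                changed = True

    kept = [(lhs, rhs) for lhs, rhs in rules if usable(rhs)]

    # Pass 2: keep only rules reachable from the start symbol.
    by_lhs = defaultdict(list)
    for lhs, rhs in kept:
        by_lhs[lhs].append(rhs)

    reachable = set()
    stack = [start] if start in productive else []
    while stack:
        sym = stack.pop()
        if sym in reachable:
            continue
        reachable.add(sym)
        for rhs in by_lhs[sym]:
            stack.extend(s for s in rhs if s in nonterminals)

    return [(lhs, rhs) for lhs, rhs in kept if lhs in reachable]


# Toy grammar (hypothetical) and the words found in the text to be parsed.
toy_rules = [
    ("S", ("NP", "VP")),
    ("NP", ("dogs",)),
    ("NP", ("cats",)),
    ("VP", ("bark",)),
    ("VP", ("V", "NP")),
    ("V", ("chase",)),
    ("X", ("dogs",)),
]
print(extract_subgrammar(toy_rules, "S", {"dogs", "bark"}))
```

With W = {dogs, bark}, only the rules S -> NP VP, NP -> dogs and VP -> bark survive. Every parse tree of a sentence over W uses only productive, reachable rules, so the reduced grammar yields the same trees for those sentences. The HPSG case is harder because feature structures and unification replace atomic symbols, which is where the NP-hardness result in the abstract comes in.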

Similar resources

Exploring HPSG-based Treebanks for Probabilistic Parsing HPSG grammar extraction

We describe a method for the automatic extraction of a Stochastic Lexicalized Tree Insertion Grammar from a linguistically rich HPSG Treebank. The extraction method is strongly guided by HPSG-based head and argument decomposition rules. The tree anchors correspond to lexical labels encoding fine-grained information. The approach has been tested with a German corpus achieving a labeled recall of...

Application-driven automatic subgrammar extraction

The space and run-time requirements of broad coverage grammars appear for many applications unreasonably large in relation to the relative simplicity of the task at hand. On the other hand, handcrafted development of application-dependent grammars is in danger of duplicating work which is then difficult to re-use in other contexts of application. To overcome this problem, we present in this paper...

Grammar Extraction and Refinement from an HPSG Corpus

Grammar learning and refinement on the basis of language resources is very appealing in comparison with manual development of formal grammar. But in order to learn a complex grammar a complex resource is needed. Thus the creation of language resources and learning of grammars from them have to be aware of each other. In this paper we define a formal basis for annotation of corpora with respect ...

Extracting Supertags from HPSG-based Tree Banks

We describe a method for the automatic extraction of a Stochastic Lexicalized Tree Insertion Grammar from a linguistically rich HPSG Treebank. The extraction method is strongly guided by HPSG-based head and argument decomposition rules. The tree anchors correspond to lexical labels encoding fine-grained information. The approach has been tested with a German corpus achieving a labeled recall of...

Publication date: 2001